AITopics | Suceava County

Collaborating Authors

Suceava County

Automatic Input Rewriting Improves Translation with Large Language Models

arXiv.org Artificial IntelligenceFeb-23-2025

Can we improve machine translation (MT) with LLMs by rewriting their inputs automatically? Users commonly rely on the intuition that well-written text is easier to translate when using off-the-shelf MT systems. LLMs can rewrite text in many ways but in the context of MT, these capabilities have been primarily exploited to rewrite outputs via post-editing. We present an empirical study of 21 input rewriting methods with 3 open-weight LLMs for translating from English into 6 target languages. We show that text simplification is the most effective MT-agnostic rewrite strategy and that it can be improved further when using quality estimation to assess translatability. Human evaluation further confirms that simplified rewrites and their MT outputs both largely preserve the original meaning of the source and MT. These results suggest LLM-assisted input rewriting as a promising direction for improving translations.

computational linguistic, rewrite, translation, (13 more...)

arXiv.org Artificial Intelligence

2502.16682

Country:

Asia > Singapore (0.05)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
Oceania > Guam (0.04)
(24 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Government > Military (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

RELATE: A Modern Processing Platform for Romanian Language

Păiş, Vasile, Ion, Radu, Avram, Andrei-Marius, Mitrofan, Maria, Tufiş, Dan

arXiv.org Artificial IntelligenceOct-29-2024

This paper presents the design and evolution of the RELATE platform. It provides a high-performance environment for natural language processing activities, specially constructed for Romanian language. Initially developed for text processing, it has been recently updated to integrate audio processing tools. Technical details are provided with regard to core components. We further present different usage scenarios, derived from actual use in national and international research projects, thus demonstrating that RELATE is a mature, modern, state-of-the-art platform for processing Romanian language corpora. Finally, we present very recent developments including bimodal (text and audio) features available within the platform.

international conference, platform, proceedings, (11 more...)

arXiv.org Artificial Intelligence

2410.21778

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.05)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(11 more...)

Genre: Research Report (0.64)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

Reddit is all you need: Authorship profiling for Romanian

Ştefănescu, Ecaterina, Jerpelea, Alexandru-Iulius

arXiv.org Artificial IntelligenceOct-13-2024

Authorship profiling is the process of identifying an author's characteristics based on their writings. This centuries old problem has become more intriguing especially with recent developments in Natural Language Processing (NLP). In this paper, we introduce a corpus of short texts in the Romanian language, annotated with certain author characteristic keywords; to our knowledge, the first of its kind. In order to do this, we exploit a social media platform called Reddit. We leverage its thematic community-based structure (subreddits structure), which offers information about the author's background. We infer an user's demographic and some broad personal traits, such as age category, employment status, interests, and social orientation based on the subreddit and other cues. We thus obtain a 23k+ samples corpus, extracted from 100+ Romanian subreddits. We analyse our dataset, and finally, we fine-tune and evaluate Large Language Models (LLMs) to prove baselines capabilities for authorship profiling using the corpus, indicating the need for further research in the field. We publicly release all our resources.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.09907

Country:

Europe > Romania > Vest Development Region > Timiș County > Timișoara (0.05)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
Europe > Romania > Sud-Vest Oltenia Development Region > Dolj County > Craiova (0.04)
(14 more...)

Genre: Research Report (0.40)

Industry: Media > News (0.74)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)

Add feedback

Contextualized AI for Cyber Defense: An Automated Survey using LLMs

Haryanto, Christoforus Yoga, Elvira, Anne Maria, Nguyen, Trung Duc, Vu, Minh Hieu, Hartanto, Yoshiano, Lomempow, Emily, Arakala, Arathi

arXiv.org Artificial IntelligenceSep-20-2024

This paper surveys the potential of contextualized AI in enhancing cyber defense capabilities, revealing significant research growth from 2015 to 2024. We identify a focus on robustness, reliability, and integration methods, while noting gaps in organizational trust and governance frameworks. Our study employs two LLM-assisted literature survey methodologies: (A) ChatGPT 4 for exploration, and (B) Gemma 2:9b for filtering with Claude 3.5 Sonnet for full-text analysis. We discuss the effectiveness and challenges of using LLMs in academic research, providing insights for future researchers.

artificial intelligence, conf, cybersecurity, (13 more...)

arXiv.org Artificial Intelligence

2409.13524

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Oceania > Australia > Victoria > Melbourne (0.05)
Europe > Estonia > Harju County > Tallinn (0.04)
(25 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Add feedback

Metaheuristic optimization of power and energy systems: underlying principles and main issues of the 'rush to heuristics'

Chicco, Gianfranco, Mazza, Andrea

arXiv.org Artificial IntelligenceAug-17-2020

In the power and energy systems area, a progressive increase of literature contributions containing applications of metaheuristic algorithms is occurring. In many cases, these applications are merely aimed at proposing the testing of an existing metaheuristic algorithm on a specific problem, claiming that the proposed method is better than other methods based on weak comparisons. This 'rush to heuristics' does not happen in the evolutionary computation domain, where the rules for setting up rigorous comparisons are stricter, but are typical of the domains of application of the metaheuristics. This paper considers the applications to power and energy systems, and aims at providing a comprehensive view of the main issues concerning the use of metaheuristics for global optimization problems. A set of underlying principles that characterize the metaheuristic algorithms is presented. The customization of metaheuristic algorithms to fit the constraints of specific problems is discussed. Some weaknesses and pitfalls found in literature contributions are identified, and specific guidelines are provided on how to prepare sound contributions on the application of metaheuristic algorithms to specific problems.

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2008.07491

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > China > Henan Province > Zhengzhou (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback